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In this work, we report a massively parallel and time domain implementation of the 3D phase 
field model that can reach beyond micron scale and consider for arbitrary electrical and mechanical 
boundary conditions. The first part of the paper describes the theory and the numerical implemen- 
tation of the model. A mixed-mode approach of finite difference (FD) and finite element (FEM) 
grid has been used for calculating the nonlocal electrostatic and elastic interactions respectively. All 
the local and non-local interactions are shown to scale linearly up to thousands of processors. This 
massive parallelization allows us to directly compare our results with multiple experiments at the 
same size scale. The second part of the paper presents results of ferroelectric switching in devices 
based on the multi-ferroic BiFe03(BF0). We have particularly emphasized the importance of charge 
driven domain growth and the effect of electrical boundary conditions that explain the temporal 
evolution of ferroelectric domains observed in recent experiments. We also predict a mechanism of 
controlling domain size in the multi-domain ferroelectric switching that could be useful for practical 
applications. 



INTRODUCTION 

Thin film devices incorporating ferro-electric and 
multi-ferroic materials have attracted substantial re- 
search effort worldwide [1 . Understanding switching 
dynamics in these multi-domain ferroelectric films influ- 
enced by arbitrary electrostatic and mechanical bound- 
ary condition remains to be a significant challenge. The 
origin of the difficulty lies in the coupling of multiple or- 
der parameters in these materials and the spatial asym- 
metry introduced by the domain walls. The necessity 
of being self consistent between various competing ener- 
gies coming from chemical, electrostatic and elastic ori- 
gin makes the temporal evolution of polarization a nu- 
merically stiff problem. Also ferroelectric domain walls 
are typically of the order of nm size whereas the whole 
pattern forms over micron sizes. This disparate length 
scales associated with ferroelectric domain walls and do- 
mains themselves necessitates a large degree of freedom 
to be simulated in order to achieve physically reasonable 
results. Under experimental conditions, the non-linear 
switching behavior makes it very difficult to design and 
characterize devices. Faced by these obstacles, it remains 
a significant challenge to make direct comparison of a 
simulation result with an experimental observation and 
also pursue rational device design using computational 
simulation. In this paper, we report a significant step for- 
ward by extending the capability of conventional phase 
field models [2 , extensively used for ferroelectric mate- 
rials, up to the micron scale where experiments are typ- 
ically performed. The distinct features of the model are 
that it can simulate structures up to micron size in 3D, 
take arbitrary electrical and mechanical boundary con- 
ditions and achieve linear scaling in calculating all the 
local and non-local electrical, mechanical and chemical 
interactions. Reaching the micron scale simulation grid 



allows us to make direct comparison with experimental 
observations in arbitrary device structures such as that 
shown in Fig. 1(a). 

In the following section, we describe the physical equa- 
tions, numerical implementation and performance. Us- 
ing this model, we simulated the lateral switching on 
various surfaces of the multiferroic material BFO. We 
make direct comparison with multiple experiments re- 
cently reported on various surfaces of BFO. Finally, we 
propose a method to control domain size during polar- 
ization switching that could be crucial for device appli- 
cations using this material. 



THEORY 

A phase field model describes the thermodynamic free 
energy of a system in terms of a continuous field variable. 
All the participating energies of the system is described 
as a function of this order parameter. For example, for 
ferroelectrics, the polarization is a suitable order param- 
eter for describing the thermodynamic energy of the sys- 
tem. A continuum approximation for describing the spa- 
tiotemporal variation of the polarization facilitates de- 
scribing the low energy dynamics of the system. Phase 
field model has been applied to understand ferroelectric 
domain switching [3HH] , strain effects [9HT7] and random 
defect effects [HI [19] . We describe the various aspects of 
the theory only briefly here. For a comprehensive review 
of the theory, see pioneering work by L. Q. Chen et al 
[2. 

The relevant contributors to the thermodynamic en- 
ergy of the ferroelectric system are the bulk energy, elec- 
trostatic energy, elastic energy and the domain wall en- 
ergy. The energy gained due to the phase transition from 
the paraelectric to the ferroelectric phase in a homoge- 
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neous unstrained ferroelectric is called the bulk energy 
and parameterized using the Landau coefficients. The 
bulk energy is given by 
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Here, are the polarization along the three (001) di- 
rections of the BFO crystal. The a's are the relevant 
Landau coefficients for different ferroelectrics. Note also 
that henceforth we will use i to denote crystallographic 
directions and / to denote the lab coordinate axis along 
which we will set up our numerical grid. Space direc- 
tions are denoted by x for crystal directions and X for 
grid directions. 

In finite ferroelectrics, electrostatic compatibility is ob- 
tained by breaking the film into ferroelectric domains. 
The variation of order parameter at the domain wall 
causes an energy cost that originates due to both strain 
and dipole-dipole interaction. This additional price in en- 
ergy is incorporated by the gradient of the polarization 
at the domain wall. The energy and the thermodynamic 
forces due to the domain walls are given by 
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Here, Pj j are the gradient of polarization along the grid 
directions. Gn is the domain wall energy coefficient 
when the grid and crystal directions coincide. We only 
consider the first order term here. Pj is the projection of 
the polarization along the grid direction /. 

In the phase field description, the inhomoge neous long 
range electrostatic interaction is taken into account by 
solving the 3D poisson equation with appropriate bound- 
ary condition. The electrostatic energy and field are 
given by 
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Here, E/ are the effective electric field along the grid 
directions. This field incorporates contributions due to 
inhomogeneous polarization. The contribution of the ap- 
plied field is added to the electrostatic potential assuming 
a linear dielectric. The depolarization field is calculated 
separately by summing over all the local contributions to 
the global polarization. 

The substrate constraint and domain variations within 
the ferroelectric film causes both homogeneous and in- 
homogeneous strain in the film. The effect of the elastic 
compatibility on the domain morphology is calculated by 



solving the stress-strain relation with thin film boundary 
condition. The elastic energy density is given by ^ 
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Here, Cijki is the elastic modulus, Cij is the total strain 
and e^j is the eigen strain and Cij is the elastic strain. 

Once the above thermodynamic energy contributions 
are included in the total energy expression, the thermo- 
dynamic driving force is calculated as the derivative of 
the total energy of the system with respect to the polar- 
ization. The subsequent temporal evolution of the polar- 
ization in 3 dimensions is calculated by using the time 
dependent Ginzberg- Landau equation. 
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Here, Ci{^^^) is the random force due to thermal fluctua- 
tion that has a zero mean and a gaussian variance. 

The bulk energy and the domain wall energy are lo- 
cal interactions and hence are candidate for direct paral- 
lelization over the distributed processors. The nonlocal 
interactions on the other hand are difficult to parallelize. 
The non-local interaction in ferroelectrics arise due to the 
inhomogeneous electrostatic and elastic field. The com- 
putation of these interactions will be described in the 
Numerical Implementation section. 



NUMERICAL IMPLEMENTATION 

We have implemented a time domain phase field model 
contrary to that described in[2(J. In general, the semi- 
implicit fourier-spectral method allows one to take signif- 
icantly longer time steps compared to a Forward-Euler 
method [20 . However, we employed a velocity verlet 
method that allowed us to take time steps significantly 
longer than the Forward-Euler method and reproduce the 
results predicted by the semi-implicit method [20 . Our 
specific motivation for pursuing a time domain imple- 
mentation is to take advantage of modern distributed 
computing architectures. We will show below that prob- 
lem sizes of N=10^ can be modeled very efficiently with 
a time domain implementation exploiting the paralleliza- 
tion achieved at every computation steps. This is a signif- 
icant step forward in terms of numerical capability com- 
pared to the state of the art phase-field modeling. Also, 
a time domain approach allows for easy and intuitive 
incorporation of electrostatic and mechanical boundary 
conditions and therefore predictive simulation of dynamic 
behavior can be performed. In our model, the simulation 
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grid consists of FEM and FD grid for elastic and electro- 
static calculations respectively. Below we describe the 
various aspects of the numerical implementation. 
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the respective FD node points. The change in free en- 
ergy is calculated due to the strain and hence the ther- 
modynamic force due to the elastic energy. A natural 
coordinate numbering was employed for the FEM calcu- 
lation. For a 3D brick element, each body node has 8 ele- 
ments connected to it and hence has a total of 26 element 
nearest neighbor nodes. Due to a significantly increased 
nearest neighbor in the FEM grid, compared to the FD 
grid, we parallelized the stiffness matrix in the natural 
coordinate numbering order. Special care was taken for 
minimizing nonlocal assignments to the stiffness matrix 
when assembling, as will be described in the FEM cal- 
culation section. Thus parallelizing the two grids in two 
different manners facilitates maximum efficiency for the 
respective problems. 



FIG. 1: (a) Schematic of a representative device structure 
that is used as a test case for the developed method. Here, 
a thin film of ferroelectric material is grown on a substrate. 
Two electrodes are placed in order to apply an in-plane elec- 
tric field. The electrostatic boundary condition on the mate- 
rial can be changed by using different materials on the ferroic 
thin film. The substrate strain can be varied by changing the 
substrate material with different lattice vectors, (b) The im- 
plemented numerical grid that contains both finite difference 
and finite element grids. The elements are a small block of 
linear brick element. The nodes of the block coincide with 
the FD grid. Both FEM and the FD grids are numbered in 
natural ordering. 



Grid 

We used a mixed grid for implementing the electro- 
static and elastic interactions. The elastic interaction 
calculation is done on a FEM grid and the electrostatic 
calculation is performed on a FD grid. The relationship 
between the FEM element and FD node numbering are 
shown in Fig. [ijb). Initially, polarizations are defined on 
a 3D FD grid. The bulk energy and domain wall energy 
are calculated on this grid. The nonlocal electrostatic 
field is calculated by solving the Poisson equation on the 
FD grid. The matrix was parallelized for a 3D FD sten- 
cil, so that maximum of the connected grid points are 
on the same processor. This way of parallelizing the FD 
stiffness matrix provides very fast matrix assembly per- 
formance and is implemented in the DA data structure 
of PETSC[21 . The spontaneous strain at every node is 
calculated from the polarizations [see equation 16, to be 
discussed later]. The body force due to the spontaneous 
strain is assigned at the nodes of the FEM grid element 
nodes. A finite element grid with linear brick element is 
then used to solve for the stress-strain relationship. Once 
the total strain is calculated by solving the stress-strain 
relation with FEM, it is assigned in a reverse manner to 



FEM calculation 

The FEM calculation involves 1) calculating the ele- 
ment stiffness matrices, 2) assembling the structure stiff- 
ness matrix, 3) applying mechanical boundary condi- 
tions, 4) calculating the body forces originating from 
plastic strain and solve for the total strain. We describe 
the FEM calculation procedure below. 



Element Stiffness Matrix 

We used an iso-parametric linear brick element for im- 
plementing the FEM calculation of the inhomogeneous 
strain. An iso-parametric implementation of the elements 
allows for Gaussian Integration during element stiffness 
calculation and thus facilitates faster assembly. The pro- 
cedure is as follows. First the iso-parametric element is 
written in natural coordinates. Then the shape function 
is calculated in terms of the natural coordinates of the 
element nodes. 
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Here, P and t] are the axes in natural coordinates. 

are the components of the shape function along the 
natural coordinates. 

The real space node locations are given by the matrix. 
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Here, X/ are the node points of the element in real space 
coordinates along the grid directions. The Jacobian ma- 
trix is calculated as 
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The space derivatives of the shape function in terms of 
the natural coordinates are given by, 



5N^ 



I 5X \ 

SY 
SZ 

V / 



(11) 



The strain-displacement matrix is constructed from 
the shape function as 



5i 



6x24 



/ 6X 




dY 



6Z 





6Y 



SX 
5Z 





\ 





6N^ 
6Z 



6Y 
6X 



(12) 



The anisotropic stiffness coefficient is given by 
E 
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Here, E is the elastic stiffness coefficient and 7 is the 
Poisson ratio. 

The element stiffness matrix is and is calculated by 
Gaussian integration in the natural coordinates. 
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Here, ±(a,b,c) are the node point coordi- 
nates of the linear brick element in real space. 
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FIG. 2: (a) FEM structure assembly by element. The newly 
added element (e8) nodes have matrix element contributions 
from the 7 elements (el-7). Note that the contributing 
elements are only those that precede this element in the 
natural grid along the three directions, (b) FEM structure 
matrix assembly by node. The node in consideration is 26 
(violet star). Due to the element connectivity the node has 
interaction with its in-plane surrounding nodes and also 
the layers above and below this node (green stars). Each 
node has a total of 26 connected nodes within the body of 
the structure. The number of connected element and nodes 
vary at the boundary. These boundary nodes and elements 
are assembled in a similar procedure with appropriate 
connectivity. 



Assembly 

Once the element stiffness matrices are calculated for 
each of the element, the structure stiffness matrix is con- 
structed by assembling the element stiffness matrices. 
Assembling the structure stiffness matrix is the main bot- 
tleneck in achieving high performance in parallel com- 
putation. We used a natural coordinate numbering for 
the FEM grid (X varying fastest, then Y and then Z). 
Thus element nodes varying along X direction are near- 
est neighbor in the global node numbering. In this grid 
numbering method, the FEM stencil for linear brick el- 
ements are widely separated on distributed processors 
for large number of degrees of freedom (DOF) as shown 
in Fi^ The element connectivity is shown in Fi^2|a). 
Here, the added element 8 is connected to other 7 ele- 
ments that are prior to this element in grid numbering 
system. Note that the connectivity is only backward, 
meaning the element stiffness matrix of eS only depends 
on the nearest neighbors el-e7. Elements that are added 
after e8 to the global grid are not essential for calculat- 
ing the stiffness matrix of eS. Since each element spans 
2 layers, any node has nearest neighbors in the adjacent 
layers in all the directions. The node connectivity for the 
FEM stencil is shown in Fig[2|b). A specific node un- 
der consideration is labeled 26 with a violet star sign. It 
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has 26 connected nodes in 8 elements that contribute to 
different columns of the row associated with this node. 
This type of multilayer connectivity of elements makes it 
very difficult to assemble the structure matrix avoiding 
non-local assignments. In fact, with our model of ele- 
ment wise assembly process, even a 32x32x3 elements 
global grid required about 120 seconds to assemble on 
moderate 4 processors. The poor performance of the el- 
ement wise assembly originates from the fact that differ- 
ent nodes associated with the nearest neighbor element 
belong to different processors (Fig[2|b)). The communi- 
cation burden between processors overwhelms the com- 
putation benefit even for a small global grid size. Hence, 
we discarded the element wise assembly process and re- 
sorted to a node wise assembly process. The essential 
idea is to reduce the processor to processor communi- 
cation as much as possible during global stiffness ma- 
trix assembly even at the cost of increased computation 
within individual processors. We calculate the element 
matrix elements for eight elements that are connected 
to a specific node locally. Theoretically this amounts to 
calculating the matrix element for each element 8 times 
assuming the worst case scenario where each node of an 
element belongs to a different processor. However prac- 
tically, for large global matrices with natural coordinate 
numbering, the 8 nodes of an element belong to only 2 
processors. The 8 nodes are divided into bottom and top 
layers, each containing 4 nodes and owned by individual 
processors. This amounts to calculation of the element 
stiffness matrices only twice instead of eight times. How- 
ever, even with this double calculation, the node wise as- 
sembly process makes the whole element stiffness matrix 
calculation and assembly local to individual processors. 
Thus linear scaling performance can be achieved in the 
global matrix calculation and assembly process for ar- 
bitrarily large structures. Note that in comparison, the 
element wise assembly shows poor scaling performance 
even for 4 processors for a moderately small grid size. 
The algorithm for implementing the node wise assembly 
is as follows: i) Determine the nodes owned by each pro- 
cessor and iterate through them, ii) At each iteration, 
determine the global nodes associated with the 8 near- 
est neighbor elements of the node and write the global 
indices of the 27 nearest neighbor nodes in an (27 x 3) 
array. This array contains the column numbers of the 
global matrix where the matrix elements will be writ- 
ten, iii) At each iteration, determine the global element 
number of the 8 elements associated with the node and 
iterate through them iv) For each element determine to 
what node of the element is the node of interest (26) con- 
nected v) Iterate through the eight nodes of the element 
vi) Fetch the values from the row of element matrix that 
corresponds to the node number determined in step iv 
and add the values. 

Thus all the assignments in constructing the global 
matrix are local. In this manner, we avoid nonlo- 



cal assignment of adding values to the structural ma- 
trix and achieve more than two orders of magnitude 
boost in assembly process even for moderate size of the 
grid(32x 32x32x3) compared to direct calculation of el- 
ement matrices. 

Body Force 

The polarization causes an eigen strain in the unit cell. 
The eigen strain is included in the elastic equation as a 
body force. First, the eigen strain is calculated using the 
electrostriction coefficients. The body forces due to the 
eigen strain for each element are calculated and assigned 
to the element nodes. The eigen strain and the body 
forces are given by 

^% = QaaPiPj 

Ff=f [Bf[C]cUn (15) 

Here, Qij are the electrostriction coefficients. 
Next, the elastic equilibrium is calculated from 

Kdui = (16) 

Here, K is the structure stiffness matrix including the 
appropriate boundary condition and dui is the displace- 
ment. 

Boundary Conditions 

The boundary conditions are applied by making rele- 
vant changes to the matrix that contains information for 
structure stiffness. For example, in the thin film struc- 
ture, the regularly used boundary condition is periodic 
along Xy, clamped at the bottom and stress free at the 
top. The periodic boundary condition can be applied 
during assembly process assuming the elements at the 
left boundary to be the neighbor for elements on the right 
boundary. The clamped boundary condition is appHed 
by zeroing all the values in the matrix corresponding to 
the bottom interface leaving the diagonal entries as 1. A 
specific displacement can be assigned to these nodes by 
setting these values on the right hand side. The stress 
free boundary condition is similarly applied by zeroing 
the forces at the corresponding nodes along the appro- 
priate direction. 

Once these modifications due to the boundary condi- 
tions are employed to the matrix and the right hand side 
vector, the resulting total displacement is calculated by 
solving the Kdu = f equation. Again we use an itera- 
tive Krylov subspace solver for solving the set of linear 
equations. 
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Electrostatics Calculation 



Electric field due to the inhomogeneous polariza- 
tion and the applied field incorporating the appropriate 
boundary condition is calculated on the FD grid. The 
charge density resulting from the inhomogeneous polar- 
ization and the resulting potential are calculated using 



p = V -P 



(17) 
(18) 
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Different boundary conditions can be easily applied when 
assembling the laplacian operator in 3D. Some of the use- 
ful boundary conditions are periodic, Dirichlet, Neumann 
and inhomogeneous material interface. The applied field 
is incorporated by assigning a predefined value that is 
a Dirichlet boundary condition where the electrodes are 
placed. The fioating boundary or the Neumann bound- 
ary condition is applied on open surfaces by setting the 
normal component of the electric field to zero. 



FIG. 3: (a) The FEM stiffness matrix assembly time as 
a function of the number of processors used with varying 
grid size, (b) The total cycle time for 6 iterations including 
the nonlocal electrostatic and elastic interactions as a 
function of the number of processors used with varying grid 
size, (c) Using a non-zero initial guess from the last time 
step solution, for the linear solver during the electrostatic 
and elastic interaction calculation improves the overall 
performance of the by a factor of 2 for all number of pro- 
cessors used, (d) Calculation of the long range interactions 
every 5 steps compared to every single step, improved the 
performance by about 4 times. This does not change the 
physical results since, the long range interactions usually 
act at low frequency compared to the short range interac- 
tion. For all the three structure sizes, we obtain linear scaling. 



Time Integration 

We tested both the explicit forward-euler and the ve- 
locity verlet algorithm. The latter algorithm allowed for 
an order of magnitude longer time steps compared to the 
direct forward euler method for stiff problems. In order 
to check the accuracy of the results, we simulated the re- 
sults presented in Ref. [20 . Both equilibrium and kinetic 
results were tested. With reduced space discretization 
length of Inm and reduced time step size of 0.01, we re- 
produced the results presented in Ref. [20 . We used the 
same space and time discritization length for the results 
reported here. The velocity- verlet integration algorithm 
implemented is given by the equations. 

5t 2\ St 5t J 

pr+^=p^'+^At (19) 



Numerical Performance 

The parallel performance of the 3D phase-field code 
was analyzed on Hopper of the NERSC facility. The 
Hopper machine is equipped with 24 2.1 GHz processors 
and 32 GB memory per node. It uses Gemini intercon- 
nect for inter node communication that has a latency of 
~1 /is. In FIG [sj^a) we show the parallel performance 
of the FEM assembly algorithm. Three different sizes of 
the grid are assembled from a micron to a quarter micron 
square device. The smallest structure is assembled using 
264 processors taking 57 seconds. When 1056 proces- 
sors are used, the assembly time reduces to 16 seconds. 
Thus we achieve a nearly linear scaling for the assem- 
bly process. Similar scaling behavior is also achieved for 
larger structures (up to micron size) using more number 
of processors (up to 5k) as shown in FIGjsJa). For small 
deformations of the structures during the switching pro- 
cess, it is sufficient to work with the initial FEM stiffness 
matrix at subsequent time steps. Hence, we separate the 
assembly process and cycle time for each iteration of the 
self-consistent phase-field calculation. Each time step of 
the phase-field calculation consists of calculating the lo- 
cal bulk energy, domain wall energy and the non-local 
electrostatic and elastic energv. In FIG ^h), we show 



7 



the total cycle time of 6 iterations. For the smallest grid 
(same as FIG|3|a)), the average time for a single time 
step is 7.1 s with 264 processors. The cycle time re- 
duces to 1.7 s when 1056 processors are used. Again, 
the computation time per cycle scales linearly with in- 
creasing number of processors. The larger grid sizes with 
increasing number of processors show similar scaling be- 
havior. For calculation of non-local interactions during 
time stepping, if we initialize the linear solver with the 
result of the last time step, the number of iteration re- 
quired for the linear solver used in both electrostatic and 
elastic calculation, decreases significantly. We gain an 
overall two fold performance benefit for all structure sizes 
tested across different number of processors as shown in 
FIG.|3|c)). We also found that, when simulating the spa- 
tially extended dynamics of a ferro-electric structure, the 
short range interactions gives rise to stiffness. However, 
the long range interactions usually have a longer tempo- 
ral wavelength. Hence, for fixed size of the time step, it 
is reasonable to sample the long range interactions ev- 
ery few sample of the short range dynamics. Hence, we 
calculate the long range interaction once after every 5 
steps of the short range interaction calculation. We find 
that this approximation does not change the physical re- 
sults obtained. However, the overall cycle time reduces 
by about 4 fold as shown in FIG.jsj^d). Thus significant 
improvement in scale can be achieved by a parallel imple- 
mentation of the phase-field model. Reaching the micron 
scale could enable direct comparison of simulation results 
with experimental observation that we discuss next. 



RESULTS 

In this section, we show the calculation results of fer- 
roelectric switching and domain pattern evolution on the 
001 surface of BFO and emphasize the importance of 
electrical boundary condition on the observed switching 
pattern. The presented simulation results have the di- 
mension of 1056nmx 1056nmx32nm. The energy was 
normalized so that dimensionless space-time are obtained 
following Ref. [22]. The spatial grid size is 1 nm and 
time step size is 0.005. The thermodynamic parameters 
for BFO were obtained from Ref. ^23^. A generic de- 
vice structure incorporating the multiferroic material is 
shown in FIG.[TJa). The device shows the ferroic mate- 
rial is coherently strained by the substrate. An in-plane 
electric field is applied using the two in-plane electrodes. 
The electrostatic boundary condition of the film surface 
is controlled by placing a metal or dielectric or simply 
leaving it open. This representative device structure is a 
prototype of many recent experimentally reported ferro- 
electric devices. 
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FIG. 4: Evolution of the polarization on the (001) surface 
under short circuit boundary condition, (a) The initial do- 
main pattern with left (light blue) and up (red) polarization 
domains, (b) Nucleation of right polarization domain (yel- 
low) through 71° switching of up domains, (c) 71° switched 
domain (yellow) grows. (d,e) A new domain grows towards 
south (deep blue) and eventually switches the whole domain. 
The global polarization switches by 180° in the process, (f) 
Experimental observation of 180° switch of the domains under 
a short circuit boundary condition (from Ref. [2i] l 

In this particular case, a metallic boundary condition 
was applied on the top (001) surface. We assumed that 
the thin film is coherently strained by the substrate. A 
low electric field was applied along the [110] direction 
that is just above the coercive field of the 71° switch. 
The evolution of the thin film BFO as a function of 
time is shown in FIG. [4j The application of an in-plane 
voltage on a BFO thin film with striped domain struc- 
ture can only generate in-plane 71° and 109° switching 
events. BFO's thermodynamic potential profile is such 
that Ec(71°) < Ec(109°). The calculated coercive field 
of the 71° switch is 420 kV/cm and for the 109° switch it 
is 490 kV/cm. Considering the as grown 71° striped BFO 
configuration represented in FIG. [4| the high saturation 
polarization ( 90 /iC/cm^) of BFO causes all of the do- 
mains to arrange in-plane in a head-to-tail configuration 
so that the dipole-dipole energy is minimized. Now, for 
an applied electric field of strength Ec(109°) < ^applied < 
Ec(71°) and directed from left to right (here along [110] 
BFO), our calculation demonstrates that the ferroelectric 
domains with an in-plane polarization oriented perpen- 
dicular to the applied electric field align first towards the 
direction of this external field (from [111] BFO to [111] 
BFO )(Fig. 4(c)). This corresponds to a 71° switching 
event due to the applied field (in-plane switching of the 
ferroelectric domain oriented antiparallel to the electric 
field would correspond to a 109° switching event, from 
[111] BFO to [111] BFO). The ferroelectric switching be- 
gins at the domain wall. Although, the simulation was 
started from a uniform polarization distribution within 
the domains, some domain regions (yellow) switch earlier 
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than the rest of the domains. This specific pattern for- 
mation during the switching process is a result of the 
long range electrostatic interaction. Non-uniform charge 
originates in the domain during switching which causes 
this electrostatic field. Thus some region of the domain 
are under a higher effective electric field compared to 
the other regions. Eventually the whole domain switches 
along the applied field direction. The switched domain 
(yellow) generates an energetically unfavorable head-to- 
head configuration at the wall. The theoretical maxi- 
mum limit of the dipole-dipole fields at these domain 
walls can reach up to 10"^ kV/cm due to the induced 
charge. Domains originally oriented antiparallel to the 
electric field (along [llljBFO) then switch in-plane by 
90° (corresponding to a second 71° switching event to 
the direction [111] BFO ) under this dipole-dipole field 
to recover the preferred head-to-tail configuration of the 
polarizations (dark blue region). The growth of a new 
domain also exhibits a pattern due to electrostatic inter- 
action. Note that, this long range electrostatic interac- 
tion is in-plane due to the metallic boundary condition 
applied on the top surface. We have described a sec- 
ond order switching that occurs in this configuration and 
compared to experiment in ref. [24] , 



Applied Field along (110) 




FIG. 5: Evolution of the polarization on the (001) surface 
with an open boundary condition, (a) The initial domain 
pattern with left (light blue) and up (red) oriented polar- 
ization domains, (b) Anisotropic growth of right oriented 
domain (yellow) through a 71° switch of the up(red) oriented 
polarization to right oriented domain (yellow). (c) The 
up (red) oriented domain grows simultaneously through 
domain wall switching of the left (light blue) domain, (d) 
Emergence of domain patterns (between red and yellow 
domains) aligned at 90° to the initial domain pattern. (e,f) 
PFM image showing the 90° switch of domain pattern under 
open circuit boundary condition (from Ref. [25j) 

In order to emphasize the important role that electrical 
boundary conditions play for the domain pattern reorga- 
nization during lateral electrical switching, we simulated 
the same device structure with open boundary condition. 
A Neumann boundary condition was applied on the top 
surface with the top 5 layers as air. Due to the differ- 



ence in dielectric constant between BFO (e=100) and air 
(e=l), most of the electric field go through air. The 3D 
electrostatic interaction makes the growth of the domains 
anisotropic. When an open circuit boundary condition 
is applied, a completely different result in terms of the 
domain pattern is obtained as shown in FIG. [5] We in- 
troduced two nucleation centers in the initial domain in 
order to study how these nucleated domains grow when 
an electric field is applied. These switched regions (yel- 
low) cause a charged domain walls with the initial do- 
main (red). The charges at the domain wall is reduced 
by growth of the switched domain (yellow) perpendicular 
to the initial domain wall as shown in FIG.jsJb). When 
the growing domain (yellow) enters the left oriented (light 
blue) region (FIG.jsJc)), the polarizations in the two do- 
mains (light blue and yellow) are head to head. In this 
region, the yellow domain grow perpendicular to the elec- 
trodes rather than perpendicular to the initial domains. 
However, due to domain wall switching, most of the re- 
gion is up directed (red). Hence, the growth of the newly 
formed domain (yellow) is mostly perpendicular to the 
initial domain wall. Overall, we observe a new domain 
pattern consisting of red and yellow regions that is at 
90° to the initial domain pattern as shown in FIG.|5|d). 
Experimentally this type of switching has been observed 
in planar switching of BFO with open circuit boundary 
condition [25]. Note that the domain walls have recon- 
structed and the width of the domains with polarization 
along the applied field is higher. This 90° reorientation 
of the domain pattern with an applied field is solely due 
to the anisotropic growth of domains that occur at the 
domain walls. 

Applied Field along [HQ] 
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FIG. 6: Evolution of the polarization on the (001) surface 
with an open boundary condition without considering the 
domain wall charge, (a) The initial domain pattern, (b) 
Isotropic growth of right oriented polarization domain 
(yellow) through an 71° switch of the up polarization (red). 
(c,d) Gradual isotropic growth of the switched domain 
(yellow) due to the applied field. The emergent domains that 
do not have a specific stripe like pattern since the effect of 
charge was ignored. 



If we ignore the charges at the domain walls, then the 
individual domains grow isotropically as shown in Fig [6] 
Due to the isotropic growth, the reconstructed pattern do 
not show any specific topography also the domain wall 
speed is significantly slow. This result is unphysical, since 
the presence of electrical charge should make the pattern 
stripe like. 
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Applied Field Along [100] Direction 

Study of temporal evolution of ferroelectric domain 
switching is facilitated by recent advent of in-plane ca- 
pacitor geometry and PFM analysis. There are a num- 
ber of recent experimental report on the temporal evo- 
lution of the ferroelectric domain under a lateral applied 
electric field [26, 27 . Here, we show the simulation re- 
sult of ferroelectric domain switching on the (001) plane 
of BFO when the applied field is along the [100] direc- 
tion, as shown in Figj?] An open circuit boundary con- 
dition was applied in the same manner as described in 
the previous section. The device geometry corresponds 
to that of [W. Similar switching mechanism applies to 
that reported in [27^. The simulated device dimension is 
1056nm x 264nm x 32nm. 
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FIG. 7: Evolution of the polarization on the (001) surface 
with an open boundary condition when a field is applied 
along the [100] direction, (a) The initial domain pattern 
with a defect introduced where the switching starts (dark 
blue and red dot), (b) Anisotropic growth of right oriented 
polarization domain (dark blue and red) through an 71° 
switch of the left polarization (light blue and yellow) along 
the applied field direction. (c) The anisotropic growth 
continues and switches regions close to the electrode. Slow 
growth perpendicular to the applied field and retention of 
the domain size matches very well with the experimental ob- 
servation. (d,e) Experimental data showing the intermediate 
stage between switching (from Ref. [26] 1. 

There are two initial domain variants on the (001) sur- 
face. The [111] (light blue) and [111] (yellow). A single 
nucleation site was introduced as a nucleation point in 
the [111] domain where the nucleated polarization is ori- 



ented along the [111] direction. A voltage of -75 V was 
applied on the right electrode while the left electrode 
was grounded. In this case, the switching initially oc- 
curs at the nucleation center due to strained walls where 
the [111] (light blue) polarizations switch along the [111] 
(dark blue) direction and [111] (yellow) domains switch 
along [111] (red). The initial switching is isotropic as 
shown in Fig. [7|^a). However, as the switching domain 
grow, the walls that are parallel to the electrodes are 
charged (due to head to head polarization configuration) . 
On the other hand, the walls perpendicular to the elec- 
trodes are uncharged but strained. In subsequent switch- 
ing steps, the charged domain wall grows significantly 
faster than the sidewise strained wall. The incorpora- 
tion of inhomogeneous electric field in the model capture 
this physical process that occur due to the creation of 
charged domain walls. The inhomogeneous electric fields 
due to the domain wall charges terminate at the elec- 
trode. The sign of the charges are such that they aid 
the applied field along the applied field direction. Hence 
the polarizations are under a higher effective field along 
the applied field direction in the regions where striped 
domain region is formed. Anisotropic domain growth oc- 
curs due to this effective electric field as shown in Fig. 
^h). Once a narrow stripe has been created due to for- 
ward domain growth, the domain walls perpendicular to 
the electrodes are charge compensated. The wall growth 
velocity is significantly slower in the sidewise direction 
than in the forward direction. At this stage, domains 
grow only perpendicular to the electrodes Qc)). 

Applied Field [lOO; 




(b) 




Applied Field [100] 
.0 



FIG. 8: Evolution of the polarization on the (001) surface 
with an open boundary condition when a field is applied 
along the [100] direction without considering the domain 
wall charge, (a) The initial domain pattern with a defect 
introduced where the switching starts, (b) Isotropic growth 
of right oriented polarization domain (dark blue) through 
a 71° switch of the left polarization (light blue) along the 
applied field direction. 



The anisotropic growth of the domain wall occurs due 
to the creation of charge at the switched domain wall. If 
we do not consider the domain wall charge, then the do- 
main growth is isotropic and a circular domain is created 
due to the isotropic domain wall energy. The domain 
growth in this case is shown in Fig. [Sj 
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Domain Reconstruction 

One interesting aspect of the anisotropic switching pro- 
cess is that in this case no domain wah reconstruction 
takes place. The coercive field for switching at the do- 
main wall is in fact same as that for the strained sidewise 
domain switching. However, the nature of the domain 
wall switch is that at the domain wall, polarization from 
one domain switches towards the polarization in the other 
domain. For an applied field that is perfectly symmet- 
ric with respect to the two polarization domains, there 
occurs a frustrated condition for switching from initial 
to the final polarization direction as the probability of 
switching in either directions are the same. Thus we find 
that for an applied field that is orthogonal to the domain 
walls cannot move the walls during switching and no do- 
main size reconstruction takes place. On the contrary, 
when the applied field is not perpendicular to the domain 
wall, the applied field is not symmetric with respect to 
the polarizations in the adjacent domains. One direction 
of the DW switching becomes preferable compared to the 
other. The DW propagates deterministically in a specific 
direction and the domain size reconstructs. This type of 
DW switching and domain size reconstruction is shown 
in Fig. |4] It is important to note that the suppres- 
sion of DW switching occurs due to the same coercive 
field of the two domain variants at the wall. So if the 
saturation polarization in the two domains are different, 
for example due to anisotropic strain, then the coercive 
field for DW switching will also be different for the two 
adjacent domains. In this case, an applied field perpen- 
dicular to the domain wall will also cause DW switching 
and consequently DW movement. However, depending 
on the relative magnitude of the coercive fields for the 
two processes, it is possible to find an angle with re- 
spect to the domain wall where the coercive field for the 
two processes are the same. In brief, it is possible to 
find a direction for the applied field such that during the 
switching process, the DW switching is locked and no 
domain size reconstruction takes place irrespective of the 
anisotropic strain introduced by the substrate. For real 
applications, domain size control could be an important 
design parameter. 

CONCLUSION 

In summary, we have reported a massively parallel time 
domain phase field model. We have achieved nearly lin- 
ear scaling for all the steps in the calculation. The near 
perfect scaling allows us to simulate the switching dy- 
namics of micron scale devices. Especially, the nonlinear 
response of multi-domain ferroelectric films caused by ar- 
bitrary electrostatic and mechanical boundary conditions 
can be easily studied and direct comparison with the ex- 
perimental results can be made. Here we have shown 



the results of lateral electric field switching incorporat- 
ing the multiferroic material BFO and compared the ob- 
tained results with multiple experimental results. We 
have particularly emphasized the importance of charge 
driven domain growth mechanism that explains the tem- 
poral evolution of ferroelectric domains observed in ex- 
periments. We have also elucidated the role of electro- 
static boundary condition on the lateral multi-domain 
ferroelectric switching. Finally, we predict a method of 
domain size control during ferroelectric switching which 
could find important applications in device design using 
these materials. We believe that the model will be useful 
in predicting device operation at the length scale where 
a lot of current experiments are being performed. 

This work supported in part by Nanoelectronic Re- 
search Initiative (NRI) and National Science Founda- 
tion (NSF). The computer simulations were performed 
at NERSC. 
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